Measuring Sociality in Driving Interaction
Interacting with other human road users is one of the most challenging tasks
for autonomous vehicles. To drive congruently with humans, it is essential to
recognize and comprehend sociality, encompassing both the implicit social norms
and the individualized social preferences of human drivers. To understand and quantify
the complex sociality in driving interactions, we propose a Virtual-Game-based
Interaction Model (VGIM) that is parameterized by a social preference
measurement, Interaction Preference Value (IPV). The IPV is designed to capture
the driver's relative inclination towards individual rewards over group
rewards. A method for identifying the IPV from observed driving trajectories is also
developed, with which we assessed human drivers' IPV using driving data
recorded in a typical interactive driving scenario, the unprotected left turn.
Our findings reveal that (1) human drivers exhibit particular social preference
patterns while undertaking specific tasks, such as turning left or proceeding
straight; (2) human drivers may strategically take competitive actions in order
to coordinate with others. Finally, we discuss the potential
of learning sociality-aware navigation from human demonstrations by
incorporating a rule-based humanlike IPV expressing strategy into VGIM and
optimization-based motion planners. Simulation experiments demonstrate that (1)
IPV identification improves the motion prediction performance in interactive
driving scenarios and (2) the dynamic IPV expressing strategy extracted from
human driving data makes it possible to reproduce humanlike coordination
patterns in driving interactions.
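The IPV's trade-off between individual and group rewards can be sketched with an angle-based parameterization, as in classic social-value-orientation models. The cosine/sine weighting, the toy rewards, and the regret-based identification below are illustrative assumptions for this sketch, not the paper's actual formulation:

```python
import math

def combined_reward(ipv, r_self, r_group):
    # IPV as an angle: cos weights the individual reward, sin the group reward
    # (a common social-value-orientation form, assumed here for illustration)
    return math.cos(ipv) * r_self + math.sin(ipv) * r_group

def identify_ipv(observed_action, action_rewards, candidates):
    """Pick the candidate IPV under which the observed action loses the
    least reward relative to the best response (minimum-regret fit)."""
    def regret(ipv):
        values = {a: combined_reward(ipv, rs, rg)
                  for a, (rs, rg) in action_rewards.items()}
        return max(values.values()) - values[observed_action]
    return min(candidates, key=regret)

# Toy unprotected-left-turn choice: "go" favors self, "yield" favors the group.
action_rewards = {"go": (1.0, 0.2), "yield": (0.3, 1.0)}
candidates = [i * math.pi / 8 for i in range(5)]   # IPV grid from 0 to pi/2
ipv = identify_ipv("yield", action_rewards, candidates)
```

A driver observed yielding is assigned a more prosocial (larger) IPV than one observed going, which is the qualitative behavior the identification method needs.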
Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning
Scene text recognition has been studied for decades due to its broad
applications. However, despite Chinese characters possessing different
characteristics from Latin characters, such as complex inner structures and a
large number of categories, few methods have been proposed for Chinese Text
Recognition (CTR). In particular, the large number of categories poses
challenges for dealing with zero-shot and few-shot Chinese characters. In this paper, inspired
by the way humans recognize Chinese texts, we propose a two-stage framework for
CTR. First, we pre-train a CLIP-like model by aligning printed character
images and Ideographic Description Sequences (IDS). This pre-training stage
simulates how humans recognize Chinese characters and yields a canonical
representation of each character. Subsequently, the learned representations are
employed to supervise the CTR model, such that traditional single-character
recognition can be improved to text-line recognition through image-IDS
matching. To evaluate the effectiveness of the proposed method, we conduct
extensive experiments on both Chinese character recognition (CCR) and CTR. The
experimental results demonstrate that the proposed method performs best in CCR
and outperforms previous methods in most scenarios of the CTR benchmark. It is
worth noting that the proposed method can recognize zero-shot Chinese
characters in text images without fine-tuning, whereas previous methods require
fine-tuning when new classes appear. The code is available at
https://github.com/FudanVI/FudanOCR/tree/main/image-ids-CTR.
Comment: ICCV 202
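The alignment objective of such CLIP-like pre-training can be sketched as a symmetric contrastive (InfoNCE) loss over paired image and IDS embeddings. The encoders are elided, and this loss form is an assumption based on CLIP, not the paper's exact recipe:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def clip_alignment_loss(img_emb, ids_emb, temperature=0.07):
    """Symmetric InfoNCE: matching (image, IDS) pairs lie on the diagonal
    of the similarity matrix and should dominate their row and column."""
    img = l2_normalize(img_emb)
    ids = l2_normalize(ids_emb)
    logits = img @ ids.T / temperature          # (N, N) cosine similarities
    labels = np.arange(len(logits))
    def ce(lg):
        # row-wise cross-entropy with the diagonal as the target class
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()
    # average the image-to-IDS and IDS-to-image directions
    return 0.5 * (ce(logits) + ce(logits.T))
```

Correctly paired batches yield a lower loss than mispaired ones, which is what drives the encoders toward the canonical per-character representation described above.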
Orientation-Independent Chinese Text Recognition in Scene Images
Scene text recognition (STR) has attracted much attention due to its broad
applications. Previous works focus on recognizing Latin text images with
complex backgrounds by introducing language models or other auxiliary networks.
Unlike Latin texts, many vertical Chinese texts appear in natural scenes, which
poses difficulties for current state-of-the-art STR methods. In this paper, we
make the first attempt
to extract orientation-independent visual features by disentangling content and
orientation information of text images, thus recognizing both horizontal and
vertical texts robustly in natural scenes. Specifically, we introduce a
Character Image Reconstruction Network (CIRN) to recover corresponding printed
character images with disentangled content and orientation information. We
conduct experiments on a scene dataset for benchmarking Chinese text
recognition, and the results demonstrate that the proposed method can indeed
improve performance through disentangling content and orientation information.
To further validate the effectiveness of our method, we additionally collect a
Vertical Chinese Text Recognition (VCTR) dataset. The experimental results show
that the proposed method achieves 45.63% improvement on VCTR when introducing
CIRN to the baseline model.
Comment: IJCAI 202
An Analytical Model for Predicting the Stress Distributions within Single-Lap Adhesively Bonded Beams
An analytical model for predicting the stress distributions within single-lap adhesively bonded beams under tension is presented in this paper. By combining the governing equations of each adherend with the joint kinematics, the overall system of governing equations is obtained. Both the adherends and the adhesive are assumed to be under plane strain conditions. With suitable boundary conditions, the stress distribution of the adhesive in the longitudinal direction is determined.
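For orientation, the simplest classical model in this family is the Volkersen shear-lag analysis of a single-lap joint, shown here only as background; the paper's plane-strain formulation is more general:

```latex
% Volkersen shear-lag ODE for the adhesive shear stress \tau(x)
% (a classical simplification, not the model of this paper)
\frac{d^{2}\tau}{dx^{2}} - \lambda^{2}\,\tau = 0, \qquad
\lambda^{2} = \frac{G_a}{t_a}\left(\frac{1}{E_1 t_1} + \frac{1}{E_2 t_2}\right)
```

where $G_a$ and $t_a$ are the adhesive shear modulus and thickness, and $E_i$, $t_i$ are the adherend moduli and thicknesses; the shear stress peaks at the overlap ends, which is why boundary conditions dominate the solution.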
On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems
Reinforcement learning serves as a potent tool for modeling dynamic user
interests within recommender systems, garnering increasing research attention
of late. However, a significant drawback persists: its poor data efficiency,
stemming from its interactive nature. The training of reinforcement
learning-based recommender systems demands expensive online interactions to
amass adequate trajectories, essential for agents to learn user preferences.
This inefficiency renders reinforcement learning-based recommender systems a
formidable undertaking, necessitating the exploration of potential solutions.
Recent strides in offline reinforcement learning present a new perspective.
Offline reinforcement learning empowers agents to glean insights from offline
datasets and deploy learned policies in online settings. Given that recommender
systems possess extensive offline datasets, the framework of offline
reinforcement learning aligns with them seamlessly. Despite being a burgeoning field,
works centered on recommender systems utilizing offline reinforcement learning
remain limited. This survey aims to introduce and delve into offline
reinforcement learning within recommender systems, offering an inclusive review
of existing literature in this domain. Furthermore, we strive to underscore
prevalent challenges, opportunities, and future pathways, poised to propel
research in this evolving field.
Comment: under review
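As a minimal illustration of the offline setting the survey describes, the sketch below fits a tabular Q-function from a fixed batch of logged (state, action, reward, next-state) tuples with no online interaction, then deploys the greedy policy. The toy log and the plain fitted-Q update are assumptions for illustration; practical offline RL adds conservatism (e.g. CQL) to handle actions missing from the log:

```python
import numpy as np

def offline_q_learning(dataset, n_states, n_actions,
                       gamma=0.9, lr=0.5, epochs=50):
    """Batch Q-learning over a logged dataset; no environment is queried."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s2 in dataset:
            target = r + gamma * Q[s2].max()   # bootstrap from logged next state
            Q[s, a] += lr * (target - Q[s, a])
    return Q

# Toy interaction log: in state 0, recommending item 1 earned a click (1.0).
dataset = [(0, 1, 1.0, 1), (0, 0, 0.0, 1), (1, 0, 0.0, 0), (1, 1, 0.0, 0)]
Q = offline_q_learning(dataset, n_states=2, n_actions=2)
policy = Q.argmax(axis=1)   # greedy policy for online deployment
```

The expensive online trajectory collection is replaced entirely by replaying the fixed log, which is the core appeal of offline RL for recommenders.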
Human Health Indicator Prediction from Gait Video
Body Mass Index (BMI), age, height and weight are important indicators of
human health conditions, which can provide useful information for plenty of
practical purposes, such as health care, monitoring and re-identification. Most
existing methods of health indicator prediction mainly use front-view body or
face images. These inputs are hard to obtain in daily life, and their strict
requirements on view and pose often make the models less robust. In this
paper, we propose to employ gait videos to predict
health indicators, which are more prevalent in surveillance and home monitoring
scenarios. However, the study of health indicator prediction from gait videos
using deep learning has been hindered by the small amount of open-source data.
To address this issue, we analyse the similarity and relationship between pose
estimation and health indicator prediction tasks, and then propose a paradigm
enabling deep learning for small health indicator datasets by pre-training on
the pose estimation task. Furthermore, to better suit the health indicator
prediction task, we propose the Global-Local Aware aNd Centrosymmetric
Encoder (GLANCE) module. It first extracts local and global features by
progressive convolutions and then fuses multi-level features by a
centrosymmetric double-path hourglass structure in two different ways.
Experiments demonstrate that the proposed paradigm achieves state-of-the-art
results for predicting health indicators on MoVi, and that the GLANCE module is
also beneficial for pose estimation on 3DPW.